LabCom ALAIA

Romain Contrain / June 30, 2021 / Current

Foreign Language Learning assisted by Artificial Intelligence

Apprentissage des Langues Assisté par Intelligence Artificielle

ALAIA is a joint laboratory between IRIT and Archean Technologies. Funded for 3 years by the ANR, the French National Research Agency, this LabCom started in March 2019. ALAIA builds on the synergy between the SAMoVA team and Archean Labs, the R&D department of Archean Technologies.

Main issues and objectives

The core of ALAIA is to put computer technologies and artificial intelligence methods at the service of foreign language learning. The originality of our project is its focus on oral expression: we evaluate how well foreign language learners pronounce utterances, while taking into account the influence of their mother tongue on the target language. Our main objectives are to:

rely on the partners’ expertise in order to develop and deploy innovative software in the field of foreign language learning;
adopt a highly interdisciplinary approach based on the fields of didactics and linguistics, computer research and techniques for interaction with learners;
integrate the methods from these three domains in the development of software building blocks.

*Synergy between ALAIA’s partners, learners and experts*

We focus on the Japanese-French language pair, with Japanese as the mother tongue (L1) and French as the target language (L2). The methodology will then be applied to other language pairs. This work relies on the expertise of teachers in foreign language didactics who are already working with ALAIA’s partners.

Current work

Within the framework of the ALAIA innovative program, our first step is to work on phonetic-phonological skills by focusing on the pronunciation of phonemes at word level. Automatic detection, localization and characterization of segmental production errors is our main focus. This relies on:

(1) Speech stimuli in French.

Based on LexPro, this dataset was developed in collaboration with Waseda University (S. Detey) and the University of Pau (F. Hapel and C. Domin). About 2500 utterances were recorded in standard French and transcribed orthographically and phonetically.

(2) Recordings of Japanese learners’ production (in French).

Resulting from earlier collaborations with Waseda University, a first dataset of about 15000 recordings has been put to use within ALAIA.

(3) Phonetic annotations made by experts.

Over 7100 utterances, based on 200 stimuli (distinct words or short sentences) and produced by 67 learners, were manually transcribed at the phone level. This first annotation step was carried out by experts using specifically developed tools. About 55000 phones were labeled for correctness or error type. Another expert has recently added temporal phone segmentation and alignment information.
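As an illustration only, a phone-level annotation of this kind could be stored as one record per phone, holding the expected phone, the produced phone, a correctness or error label, and optional timestamps. The sketch below is hypothetical: the class names, field names, error categories and the toy utterance are assumptions made for the example, not the format or data used in ALAIA.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PhoneAnnotation:
    expected: Optional[str]        # canonical phone (None for an inserted phone)
    produced: Optional[str]        # phone noted by the expert (None if deleted)
    label: str                     # assumed labels: "correct", "substitution", ...
    start: Optional[float] = None  # segment start in seconds, if aligned
    end: Optional[float] = None    # segment end in seconds, if aligned


@dataclass
class UtteranceAnnotation:
    learner_id: str
    stimulus: str
    phones: List[PhoneAnnotation]

    def error_rate(self) -> float:
        """Share of annotated phones whose label is not 'correct'."""
        if not self.phones:
            return 0.0
        errors = sum(1 for p in self.phones if p.label != "correct")
        return errors / len(self.phones)


# Invented toy example: the word "rue" with its first phone substituted.
example = UtteranceAnnotation(
    learner_id="learner-01",
    stimulus="rue",
    phones=[
        PhoneAnnotation("ʁ", "l", "substitution", 0.10, 0.18),
        PhoneAnnotation("y", "y", "correct", 0.18, 0.35),
    ],
)
print(example.error_rate())  # 0.5
```

Keeping the timestamps optional reflects the fact that segmentation and alignment were added in a later annotation pass.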

Thanks to this sizable dataset, we worked on acoustic modeling enabling transcription at phone level. The different steps of our process are summed up in the figure below.

*System and resources required to achieve pronunciation error detection, localization and characterization.*

(4) Acoustic modeling adaptation to Japanese learners of French

More specifically, we used the phonetically annotated dataset mentioned above to adapt pre-existing acoustic-phonetic models ([Li, 2020], [Gelin, 2021]) to the domain of Japanese learners speaking French, which improved the quality of the resulting phone transcriptions. The adapted model was then integrated into a tool for detecting and identifying pronunciation errors, which is still in development.
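This page does not describe how the tool compares a learner's recognized phones with the expected pronunciation. One common way to detect, localize and characterize segmental errors, shown here as a minimal sketch and not as ALAIA's implementation, is to align the two phone sequences with a Levenshtein edit distance and read substitutions, deletions and insertions off the alignment. The function name `align_phones`, the label names and the toy sequences are assumptions made for this example.

```python
from typing import List, Optional, Tuple

# (label, position in the expected sequence, expected phone, recognized phone)
Edit = Tuple[str, int, Optional[str], Optional[str]]


def align_phones(expected: List[str], recognized: List[str]) -> List[Edit]:
    """Align two phone sequences with unit-cost Levenshtein edit distance."""
    n, m = len(expected), len(recognized)
    # cost[i][j] = edit distance between expected[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if expected[i - 1] == recognized[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match or substitution
                cost[i - 1][j] + 1,        # deletion of an expected phone
                cost[i][j - 1] + 1,        # insertion of an extra phone
            )

    # Backtrace from the bottom-right corner to recover one optimal alignment.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if expected[i - 1] == recognized[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                label = "correct" if sub == 0 else "substitution"
                edits.append((label, i - 1, expected[i - 1], recognized[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            edits.append(("deletion", i - 1, expected[i - 1], None))
            i -= 1
        else:
            # For insertions, the position is the index in `expected`
            # before which the extra phone was produced.
            edits.append(("insertion", i, None, recognized[j - 1]))
            j -= 1
    edits.reverse()
    return edits


# Invented toy example: an extra vowel produced inside the cluster of "plus".
edits = align_phones(["p", "l", "y"], ["p", "ɯ", "l", "y"])
errors = [e for e in edits if e[0] != "correct"]
print(errors)  # [('insertion', 1, None, 'ɯ')]
```

Each returned tuple gives the error type (characterization) and its position in the expected sequence (localization). A real system would likely weight edit costs by phonetic similarity rather than use unit costs.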

(5) Error detection and characterization at lexical and syntactic levels

An industrial PhD (CIFRE) started in January 2021 to address higher linguistic levels, focusing on lexical and syntactic errors in learners’ oral production. The aim of this research is to propose a comprehensibility measure covering both pronunciation and linguistic levels, applied to more spontaneous utterances, which differ from word or sentence repetitions. This work continues the research conducted in Estelle Randria’s PhD thesis [Randria, 2022].

LabCom Members

After Antoine Viette and Gautier Arcin, who worked with us as research engineers in 2021, Romain Contrain is now in charge of our research developments on pronunciation error detection and characterization. Complementary annotations and temporal alignment were done by Sang-Ho Kim. Verdiana De Fino, an industrial PhD student, is working on errors at higher linguistic levels. They work in close collaboration with the steering committee members (see below).

LabCom governance

The LabCom governance is organized through two committees:

Steering Committee

Isabelle Ferrané (IRIT – Co-Head) – Lionel Fontan (Archean Labs – Co-Head)
Julien Pinquier (IRIT) – Thomas Pellegrini (IRIT)

Strategic Committee

Xavier Aumont: President of Archean Technologies
Jean-Pierre Jessel: Vice President of Research – University Toulouse III
Jérôme Lelasseux: Local SATT representative – TTT Toulouse Tech transfer
Jean-Marc Pierson: Head of IRIT, representing also the head of the INS2I institute of the CNRS
Charlotte Sicre: Data protection (Informatique et Libertés) referent at IRIT – GDPR correspondent
Jean-Marc Fourcade: Engineering and Computer Science Officer – Region Occitanie
Olivier Baude: Prof. of Language Sciences, University of Nanterre – Director of the TGIR Huma-Num
Sylvain Detey: Prof. of Language Sciences, Waseda University – Japan – Expert in language didactics

Latest meeting: 21st November 2022

Publications related to ALAIA

Estelle Randria, Lionel Fontan, Maxime Le Coz, Isabelle Ferrané, Julien Pinquier, Étude des facteurs affectant la compréhensibilité de documents multimodaux : une étude expérimentale, 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, 2020, Nancy, France. pp.534-542.
Estelle I. S. Randria, Lionel Fontan, Maxime Le Coz, Isabelle Ferrané, Julien Pinquier, Subjective Evaluation of Comprehensibility in Movie Interactions. LREC 2020: 2348-2357
Verdiana De Fino, Lionel Fontan, Julien Pinquier, Corentin Barcat, Isabelle Ferrané, Sylvain Detey, Mesures automatiques de parole non-native : exploration pilote d’un corpus d’apprenants japonais de français et différenciation de niveaux, 34èmes Journées d’Études sur la Parole (JEP 2022), Jun 2022, Noirmoutier, France. pp.1-10
Verdiana De Fino, Lionel Fontan, Julien Pinquier, Isabelle Ferrané, Sylvain Detey, Prediction of L2 speech proficiency based on multi-level linguistic features, 23rd INTERSPEECH Conference : Human and Humanizing Speech Technology (INTERSPEECH 2022), The Acoustical Society of Korea, Sep 2022, Incheon, South Korea. Proc. Interspeech 2022, 4043-4047
Verdiana De Fino, Lionel Fontan, Sylvain Detey, Isabelle Ferrané, Julien Pinquier. Corpus de parole non-native et prédiction automatique du niveau de performance en expression orale : application à CLIJAF. Journées Interphonologie du Français Contemporain (IPFC 2022), Dec 2022, Paris, France.
Sylvain Detey, Lionel Fontan, Isabelle Ferrané. From Verbo-Tonal Method teachers’ training to Computer-Assisted Pronunciation Training tools: Insight from L3 pronunciation studies and automatic speech processing technology among Japanese learners of French. 11th Speech Research (SR 2022), Faculty of Humanities and Social Sciences, Zagreb, Croatia, Dec 2022, Zagreb, Croatia

Collaborations

Sylvain Detey, Waseda University – Japan – Expert in L2 didactics
- Detey, S., Fontan, L., Le Coz, M., Jmel, S., Computer assisted assessment of phonetic fluency in second language: a longitudinal study of Japanese learners of French. Speech Communication (2020) 125:69-79.
- Lionel Fontan, Shinyoung Kim, Verdiana De Fino, Sylvain Detey. Predicting speech fluency in children using automatic acoustic features. Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2022), Asia-Pacific Signal and Information Processing Association (APSIPA), Nov 2022, Chiang Mai, Thailand. pp.1086-1091
- Detey, S., Fontan, L., Le Coz, M., Barcat, C., & Kawaguchi, Y. (2022). Identifying Segmental Substitutions in Spontaneous Speech of L3-French/L1-Japanese Learners: A Corpus-based Pilot Study. Journal of the Phonetic Society of Japan, 26, 124–134
- Rubén Pérez-Ramón, Mariko Kondo, Sylvain Detey, Lionel Fontan, Maelle Amand, et al. Nativeness and Intelligibility of Japanese accented English Consonants by French Listeners. International Congress of Phonetic Sciences (ICPHS 2023), Aug 2023, Prague, Czech Republic. pp.2581-2585.

Funding and Schedule

ANR Joint Laboratory Program ANR-18-LCV3-0001 (FR) – 300k€
Start: 1st March 2019 – End of project: 31st December 2023

References

[Li, 2020] Li, X., Dalmia, S., Li, J., Lee, M., Littell, P., Yao, J., … & Metze, F. (2020, May). Universal phone recognition with a multilingual allophone system. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 8249-8253). IEEE.
[Gelin, 2021] Gelin, L., Daniel, M., Pinquier, J., & Pellegrini, T. (2021). End-to-end acoustic modelling for phone recognition of young readers. Speech Communication, 134, 71-84.
[Randria, 2022] Estelle Randria. Compréhensibilité de contenus audiovisuels : quelles approches pour une mesure objective ? Université Paul Sabatier (Toulouse 3), 2022. In French.